
Find the number of cars per individual

The example below shows how to count the number of passenger cars per individual by running collapse (count) on a vehicle dataset, and then how to break the counts down by demographic characteristics by linking the aggregated vehicle dataset to a person dataset containing personal information.

To obtain figures for individuals who do not own a vehicle, use a person dataset (the total population) as the base, i.e., link the vehicle counts into the person dataset, not the other way around. Persons with no match from the vehicle dataset are, by definition, individuals who do not own a vehicle.
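
The logic of these two steps can be illustrated outside microdata.no with a small Python sketch. It is not microdata.no syntax: the records, identifiers, and helper functions below are invented for illustration only. It shows why the direction of the link matters: counting per owner yields only persons who own at least one car, while attaching the counts to the full population also yields the non-owners.

```python
from collections import Counter

# Hypothetical vehicle records: (vehicle_id, owner_id, vehicle_group)
vehicles = [
    ("v1", "p1", "101"),
    ("v2", "p1", "101"),
    ("v3", "p2", "101"),
    ("v4", "p2", "210"),  # not a passenger car, so it is filtered out
]

# Hypothetical total population (one identifier per person)
population = ["p1", "p2", "p3"]


def cars_per_owner(vehicles, passenger_group="101"):
    """Count passenger cars per owner (the aggregation step)."""
    return Counter(
        owner for _, owner, group in vehicles if group == passenger_group
    )


def link_into_population(population, counts):
    """Attach counts to every person; no match means zero cars."""
    return {person: counts.get(person, 0) for person in population}


counts = cars_per_owner(vehicles)
num_cars = link_into_population(population, counts)
print(num_cars)  # {'p1': 2, 'p2': 1, 'p3': 0}
```

Person p3 never appears in the aggregated counts and only shows up because the population is used as the base. In the microdata.no script, such persons get a missing value after the merge, which is then recoded to 0.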

require no.ssb.fdb:32 as db

// Creating person dataset
create-dataset population
import db/BEFOLKNING_KOMMNR_FORMELL 2024-01-01 as municipality
import db/NUDB_BU 2023-08-01 as education
import db/REGSYS_ARB_YRKE_STYRK08 2023-11-16 as occupation
import db/INNTEKT_LONN 2022-12-31 as salary
import db/BEFOLKNING_FOEDSELS_AAR_MND as birth_date

generate county = substr(municipality,1,2)
generate education_level = substr(education,1,1)
generate age = 2024 - int(birth_date/100)


// Creating car dataset
create-dataset cars
import db/KJORETOY_KJT_GRUP 2023-12-31 as vehicle_group
import db/KJORETOY_KJORETOYID_FNR 2023-12-31 as idnr
tabulate vehicle_group
keep if vehicle_group == '101'

// Aggregating to individual level and finding the number of cars per individual
collapse (count) vehicle_group -> num_cars, by(idnr)

// Linking information about the number of cars to the person dataset
merge num_cars into population


// Using the person dataset to create statistics on the number of cars per individual
use population

destring education_level
recode education_level (0/3 = 1 'Not completed secondary')(4/5 = 2 'Secondary education')(6 = 3 'Higher education, lower level')(7 = 4 'Higher education, higher level (master level)')(8 = 5 'Doctoral-level education')(9 = 6 'Not stated')

// Cloning the car-count variable: num_cars combines counts above 2 into one category, while num_cars_raw keeps the actual count
clone-variables num_cars -> num_cars_raw
recode num_cars (missing = 0 'None')(3/max = 3 '>')
recode num_cars_raw (missing = 0 'None')

textblock
Total population by number of cars, with separate figures for individuals in managerial occupations ('1') and in occupations without significant education requirements ('9')
endblock
tabulate num_cars
piechart num_cars
piechart num_cars if substr(occupation,1,1) == '1'
piechart num_cars if substr(occupation,1,1) == '9'

textblock
Number of cars per person distributed by education level
endblock
tabulate education_level num_cars, rowpct missing
piechart num_cars if education_level == 1
piechart num_cars if education_level == 2
piechart num_cars if inrange(education_level,3,5)
barchart(mean) num_cars_raw, over(education_level)

textblock
Average annual salary and age distributed by number of cars
endblock
barchart(mean) salary, over(num_cars)
barchart(mean) age, over(num_cars)


// Creating geographical divisions

define-labels county_label '03' Oslo '11' Rogaland '15' 'Møre og Romsdal' '18' Nordland '31' Østfold '32' Akershus '33' Buskerud '34' Innlandet '39' Vestfold '40' Telemark '42' Agder '46' Vestland '50' Trøndelag '55' Troms '56' Finnmark
assign-labels county county_label

// Cloning the municipality variable and keeping selected cities (all other municipalities are placed in a combined category)
clone-variables municipality -> city
replace city = '9999' if !inlist(city,'0301','1103','3201','3301','4204','4601','5001','5501')
recode city ('9999' = '9999' 'Other municipalities')

textblock
Number of cars per person by place of residence, and the top 10 municipalities ranked by the proportion of individuals who do not own a car
endblock
tabulate county num_cars, rowpct rowsort(0)
barchart(mean) num_cars_raw, over(county)

tabulate municipality num_cars, rowpct rowsort(0) bottom(10)
tabulate city num_cars, rowpct rowsort(0)
barchart(mean) num_cars_raw, over(city)
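
The script keeps two versions of the car count: a capped one (num_cars) for tables and pie charts, and an uncapped one (num_cars_raw) for averages. A minimal Python sketch with invented counts, not microdata.no syntax, suggests why the average should be taken on the uncapped variable: capping pulls large values down to 3 and therefore lowers the mean.

```python
from statistics import mean

# Invented car counts for six hypothetical persons
raw_counts = [0, 1, 1, 2, 5, 8]


def top_code(n, cap=3):
    """Combine all values at or above the cap into one category."""
    return min(n, cap)


# Capped version, comparable to recoding 3/max to 3
capped = [top_code(n) for n in raw_counts]

print(capped)                       # [0, 1, 1, 2, 3, 3]
print(mean(capped) < mean(raw_counts))  # True
```

The capped variable gives a compact set of categories for frequency tables, while the raw variable preserves the information needed for a correct mean.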